An Architecture for Using Tertiary Storage in a Data Warehouse
Abstract
In this paper, we present an architecture for a data warehouse that provides convenient, flexible, and high-performance access to tape-resident data. Data warehouses allow an organization to store and analyze operational data. Many data warehouses (e.g., for telecommunications) create very large data sets that can be stored economically only on tertiary storage. At the same time, data analysts need to run a wide variety of decision support and data mining queries on the data. While one can (and typically does) extract compact summaries of the warehoused data, many queries can be answered only from the full data set. Using database systems to access large tape-resident tables of small objects is conventionally viewed as a difficult problem, because it is easy to express a query that takes a very long time to process (e.g., the join of two tape-resident tables). Fortunately, decision support queries can usually be expressed in a way that can be implemented efficiently on tape-resident data. In addition, many data mining algorithms have been optimized to make a small number of passes over a data set. We present the features of a tape-based data warehousing system that provides efficient support for data mining and decision support queries on very large collections of small objects, together with a prototype implementation and preliminary performance results.
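As a rough illustration of why some decision support queries suit tape-resident data, the sketch below computes a group-by aggregate in a single forward pass over sequentially stored records. It is not the system described in the paper: the record layout, the `sequential_scan` stand-in for a tape reader, and the function names are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record layout (an assumption): (customer_id, call_minutes).
Record = Tuple[str, float]


def sequential_scan(tape: Iterable[Record]) -> Iterator[Record]:
    """Stand-in for a tape reader: yields records strictly in stored order."""
    for record in tape:
        yield record


def total_minutes_by_customer(tape: Iterable[Record]) -> Dict[str, float]:
    """Compute a group-by sum in one forward pass, with no seeks or rewinds."""
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, minutes in sequential_scan(tape):
        # Only the small per-customer summary is kept in memory.
        totals[customer_id] += minutes
    return dict(totals)


if __name__ == "__main__":
    tape = [("a", 3.0), ("b", 1.5), ("a", 2.0)]
    print(total_minutes_by_customer(tape))
```

The point of the sketch is the access pattern: the data is read once, front to back, which matches how tape performs best. A naive join of two tape-resident tables, by contrast, can force repeated passes over one of the tables.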
Publication date: 1997